HexaGAN: Generative Adversarial Nets for Real World Classification
Most deep learning classification studies assume clean data. However, when
dealing with real-world data, we encounter three problems: 1) missing data,
2) class imbalance, and 3) missing labels. These problems
undermine the performance of a classifier. Various preprocessing techniques
have been proposed to mitigate one of these problems, but an algorithm that
assumes and resolves all three problems together has not been proposed yet. In
this paper, we propose HexaGAN, a generative adversarial network framework that
shows promising classification performance for all three problems. We interpret
the three problems from a single perspective to solve them jointly. To enable
this, the framework consists of six components, which interact with each other.
We also devise novel loss functions corresponding to the architecture. The
designed loss functions allow us to achieve state-of-the-art imputation
performance, with up to a 14% improvement, and to generate high-quality
class-conditional data. We evaluate the classification performance (F1-score)
of the proposed method with 20% missingness and confirm up to a 5% improvement
in comparison with the performance of combinations of state-of-the-art methods.
Comment: Accepted to ICML 201
How Generative Adversarial Networks and Their Variants Work: An Overview
Generative Adversarial Networks (GAN) have received wide attention in the
machine learning field for their potential to learn high-dimensional, complex
real data distribution. Specifically, they do not rely on any assumptions about
the distribution and can generate real-like samples from latent space in a
simple manner. This powerful property leads GAN to be applied to various
applications such as image synthesis, image attribute editing, image
translation, domain adaptation and other academic fields. In this paper, we aim
to discuss the details of GAN for readers who are familiar with GAN but do
not comprehend it deeply, or who wish to view GAN from various perspectives. In
addition, we explain how GAN operates and the fundamental meaning of various
objective functions that have been suggested recently. We then focus on how the
GAN can be combined with an autoencoder framework. Finally, we enumerate the
GAN variants that are applied to various tasks and other fields for those who
are interested in exploiting GAN for their research.
Comment: 41 pages, 16 figures, Published in ACM Computing Surveys (CSUR)
Adversarial Training for Disease Prediction from Electronic Health Records with Missing Data
Electronic health records (EHRs) have contributed to the computerization of
patient records and can thus be used not only for efficient and systematic
medical services, but also for research on biomedical data science. However,
there are many missing values in EHRs when provided in matrix form, which is an
important issue in many biomedical EHR applications. In this paper, we propose
a two-stage framework that includes missing data imputation and disease
prediction to address the missing data problem in EHRs. We compared the disease
prediction performance of generative adversarial networks (GANs) and
conventional learning algorithms in combination with missing data prediction
methods. We obtained an accuracy of 0.9777, sensitivity of
0.9521, specificity of 0.9925, area under the receiver operating characteristic
curve (AUC-ROC) of 0.9889, and F-score of 0.9688 with a stacked autoencoder as
the missing data prediction method and an auxiliary classifier GAN (AC-GAN) as
the disease prediction method. The comparison results show that a combination
of a stacked autoencoder and an AC-GAN significantly outperforms other existing
approaches. Our results suggest that the proposed framework is more robust for
disease prediction from EHRs with missing data.
Comment: 10 pages, 4 figures
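The evaluation metrics named in the abstract above (accuracy, sensitivity, specificity, F-score) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the counts in the usage lines are hypothetical, and the sketch assumes every denominator is nonzero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives. Assumes no denominator is zero.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # also called recall or true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    # F-score (F1) is the harmonic mean of precision and sensitivity.
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


# Hypothetical counts, chosen only to illustrate the call.
metrics = binary_metrics(tp=90, fp=20, tn=180, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

AUC-ROC is omitted here because it depends on ranked prediction scores rather than on a single confusion matrix.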
Polyphonic Music Generation with Sequence Generative Adversarial Networks
We propose an application of sequence generative adversarial networks
(SeqGAN), which are generative adversarial networks for discrete sequence
generation, for creating polyphonic musical sequences. Instead of the monophonic
melody generation suggested in the original work, we present an efficient
representation of a polyphonic MIDI file that simultaneously captures chords and
melodies with dynamic timings. The proposed method condenses duration, octaves,
and keys of both melodies and chords into a single word vector representation,
and recurrent neural networks learn to predict distributions of sequences from
the embedded musical word space. We experiment with both the original method and a
least squares loss for the discriminator, which is known to stabilize the
training of GANs. The network can create sequences that are musically coherent
and shows improved quantitative and qualitative measures. We also report
that careful optimization of reinforcement learning signals of the model is
crucial for general application of the model.
Comment: 8 pages, 3 figures, 3 tables
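The least squares method mentioned in the abstract above replaces the discriminator's log-loss with a squared error against target labels. The following is a minimal sketch of the commonly used least squares GAN discriminator objective, assuming target 1 for real samples and 0 for generated samples. It is not the authors' implementation: the function name `lsgan_discriminator_loss` and the score values are hypothetical, and the generator update (which the abstract says relies on reinforcement learning signals) is not shown.

```python
def lsgan_discriminator_loss(d_real, d_fake):
    """Least squares discriminator loss with targets 1 (real) and 0 (fake).

    d_real: discriminator outputs on real sequences.
    d_fake: discriminator outputs on generated sequences.
    Both are non-empty lists of floats.
    """
    # Penalize real outputs for being far from 1.
    real_term = sum((d - 1.0) ** 2 for d in d_real) / len(d_real)
    # Penalize fake outputs for being far from 0.
    fake_term = sum(d ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * real_term + 0.5 * fake_term


# Hypothetical discriminator scores for a small batch.
loss = lsgan_discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.2, 0.1])
print(f"discriminator loss: {loss:.4f}")
```

Because the penalty grows quadratically with the distance from the target, samples that are classified correctly but lie far from the target still contribute gradient, which is the usual intuition for the stabilizing effect.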
Deep Trustworthy Knowledge Tracing
Knowledge tracing (KT), a key component of an intelligent tutoring system, is
a machine learning technique that estimates the mastery level of a student
based on his/her past performance. The objective of KT is to predict a
student's response to the next question. Compared with traditional KT models,
deep learning-based KT (DLKT) models show better predictive performance because
of the representation power of deep neural networks. Various methods have been
proposed to improve the performance of DLKT, but few studies have been
conducted on the reliability of DLKT. In this work, we claim that the existing
DLKTs are not reliable in real education environments. To substantiate the
claim, we show limitations of DLKT from various perspectives such as knowledge
state update failure, catastrophic forgetting, and non-interpretability. We
then propose a novel regularization to address these problems. The proposed
method allows us to achieve trustworthy DLKT. In addition, the proposed model,
which is trained on scenarios with forgetting, can also be easily extended to
scenarios without forgetting.
Stein Latent Optimization for Generative Adversarial Networks
Generative adversarial networks (GANs) with clustered latent spaces can
perform conditional generation in a completely unsupervised manner. In the real
world, the salient attributes of unlabeled data can be imbalanced. However,
existing unsupervised conditional GANs cannot cluster attributes of these data
in their latent spaces properly because they assume uniform distributions of
the attributes. To address this problem, we theoretically derive Stein latent
optimization that provides reparameterizable gradient estimations of the latent
distribution parameters assuming a Gaussian mixture prior in a continuous
latent space. Structurally, we introduce an encoder network and a novel
unsupervised conditional contrastive loss to ensure that data generated from a
single mixture component represent a single attribute. We confirm that the
proposed method, named Stein Latent Optimization for GANs (SLOGAN),
successfully learns balanced or imbalanced attributes and achieves
state-of-the-art unsupervised conditional generation performance even in the
absence of attribute information (e.g., the imbalance ratio). Moreover, we
demonstrate that the attributes to be learned can be manipulated using a small
amount of probe data.
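To make the Gaussian mixture prior in the abstract above concrete, the following is a minimal sketch of drawing a latent vector from such a prior using the standard reparameterization z = mu_k + sigma_k * eps. It only illustrates sampling. It does not implement the Stein-based gradient estimation that the paper derives, and it assumes diagonal covariances for simplicity. The function name `sample_mixture_latent` and all parameter values are hypothetical.

```python
import random


def sample_mixture_latent(weights, means, stds, rng):
    """Draw one latent vector from a diagonal Gaussian mixture.

    weights: mixing coefficients, one per component.
    means, stds: per-component lists of per-dimension means and
        standard deviations (diagonal covariance is an assumption here).
    rng: a random.Random instance, passed in for reproducibility.
    Returns the chosen component index and the latent vector.
    """
    # Pick a mixture component according to the mixing coefficients.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Reparameterize: standard normal noise, shifted and scaled.
    eps = [rng.gauss(0.0, 1.0) for _ in means[k]]
    z = [m + s * e for m, s, e in zip(means[k], stds[k], eps)]
    return k, z


# Hypothetical two-component prior with imbalanced mixing coefficients.
rng = random.Random(0)
component, z = sample_mixture_latent(
    weights=[0.8, 0.2],
    means=[[-2.0, 0.0], [2.0, 0.0]],
    stds=[[1.0, 1.0], [0.5, 0.5]],
    rng=rng,
)
print(component, z)
```

With imbalanced mixing coefficients such as these, one component is sampled more often, which mirrors the imbalanced-attribute setting the abstract describes.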
Reinforcement Learning based Recommender System using Biclustering Technique
A recommender system aims to recommend items that a user is interested in
among many items. The need for recommender systems has grown with the
information explosion. Various approaches have been suggested for providing
meaningful recommendations to users. One of the proposed approaches is to
consider a recommender system as a Markov decision process (MDP) problem and
try to solve it using reinforcement learning (RL). However, existing RL-based
methods have an obvious drawback. To solve an MDP in a recommender system, they
must cope with a large number of discrete actions, which brings RL
to a larger class of problems. In this paper, we propose a novel RL-based
recommender system. We formulate a recommender system as a gridworld game by
using a biclustering technique that can reduce the state and action space
significantly. Using biclustering not only reduces the space but also improves
recommendation quality by effectively handling the cold-start problem. In
addition, our approach can provide users with some explanation of why the system
recommends certain items. Lastly, we examine the proposed algorithm on a
real-world dataset and achieve better performance than a widely used
recommendation algorithm.
Comment: 4 pages, 2 figures, IFUP2018 (WSDM 2018 workshop)